Fix #21619: Refactor NotNullInfo to record every reference which is retracted once. #21624

noti0na1 · 2024-09-21T17:58:09Z

This PR improves the flow typing with try block.

Since we don't know at which point an exception is thrown in the body, we have to collect any reference that is once retracted.

olhotak · 2024-09-22T01:49:00Z

I think this is the right idea but reasoning about three sets is hard and it's hard to convince oneself that the implementation is correct for all cases.

I wonder if a simpler implementation would be possible with only two sets. Let's still call them asserted and retracted. We lift the invariant that asserted & retracted is empty, so it's possible for one variable to be in both. asserted is the set that we know is non-null at the very end of evaluating the expression (if no exception happens). retracted is the set that is retracted anywhere within the expression, even if it is later asserted. When we assert a variable, we add it to asserted but do not remove it from retracted. When we retract a variable, we add it to retracted and remove it from asserted. If we evaluate the expression to the very end, if a variable is in both asserted and retracted, we consider it asserted. When we're at a control flow point where the expression may not have executed to the end (because it threw an exception), we just clear the asserted set and keep only the retracted set. At control flow merges, we intersect asserted and union retracted. For a sequential composition (a1,r1) ; (a2,r2), we get ((a1-r2) | a2, r1|r2).

Do you think this would work or do you see any flaw in it?

I think maybe what I've described above are the current sets asserted and onceRetracted in this pull request. I guess the question is, if we have asserted and onceRetracted, can we get rid of retracted? (Then eventually we could rename onceRetracted to just retracted, to make the name shorter and simpler.)

olhotak · 2024-09-22T01:53:31Z

In the above proposal:

retracted means that the variable is set to null somewhere, anywhere in the expression
asserted means that the variable is set to non-null after all program points that could set it to null (if any)

noti0na1 · 2024-09-22T15:09:29Z

retracted means that the variable is set to null somewhere, anywhere in the expression

asserted means that the variable is set to non-null after all program points that could set it to null (if any)

This rule looks reasonable to me. I will think about more examples to double-check this.

noti0na1 · 2024-09-22T15:13:11Z

TODO: some other improvements I want for flow typing but haven't been implemented yet:

For definition like var a: String | Null = "", we should add a to asserted info immediately .
We only considered Nothing body for if right now. We should consider this for other control expressions as well.

olhotak · 2024-10-11T14:19:33Z

compiler/src/dotty/tools/dotc/typer/Nullables.scala


    /** The alternative path combination with another not-null info. Used to merge
     *  the nullability info of the two branches of an if.
     */
    def alt(that: NotNullInfo): NotNullInfo =
      NotNullInfo(this.asserted.intersect(that.asserted), this.retracted.union(that.retracted))

+    def withRetracted(that: NotNullInfo): NotNullInfo =


Here it would be useful to have a comment explaining when this function is intended to be used. The name explains what the function does, but the idea is to say why one would want to do that. (Compare with the alt function, whose name says when you want it.) It looks like the use case is similar to alt, but in a case when one branch is known to not complete execution normally (e.g. an infinite loop), only exceptionally. We can say that in the comment and/or consider incorporating that into the name somehow.

This also raises the question of what to do when both alternatives cannot complete normally. Theoretically one would want to assert all variables, but we don't have a way to do that in the implementation. Perhaps that case is not common enough in practice to care about.

I don't understand why we want to assert all variables if none of the alternatives complete normally? Do you have an example?

In general the analysis proves the following condition: in every execution trace leading to a given program point $p$, variable x is not null. When the code before $p$ does not terminate, there are no traces leading to $p$ so the condition is true for every variable. If there were an instruction derefencing x at $p$, that instruction would never execute, so it would be safe to assume that x is not null.

The specific case of withRetracted handles something like alt, but where the that branch does not complete normally. It is defined as <a1,r1>.withRetracted(<a2,r2>) = <a1, r1 union r2>. alt is defined as <a1,r1>.alt(<a2,r2>) = <a1 intersect a2, r1 union r2>. If we could represent the universal set in a2, then a1 intersect a2 would be a1, so we could implement the equivalent of withRetracted using alt.

I don't know whether actual use cases in practice justify the code complexity of adding a representation of the universal set to the implementation, but it is cleaner in theory.

If we don't do that, then the comment/name of withRetracted should suggest that it is like alt, but for the case where that cannot complete normally.

compiler/src/dotty/tools/dotc/typer/Nullables.scala

olhotak · 2024-10-11T14:31:10Z

compiler/src/dotty/tools/dotc/typer/Typer.scala

+      nnInfo = nothingCases.foldLeft(nnInfo):
+        (nni, c) => nni.withRetracted(c.notNullInfo)
+      if normalCases.nonEmpty then
+        nnInfo = nnInfo.seq(normalCases.map(_.notNullInfo).reduce(_.alt(_)))


This seems an abuse of seq, since there is no sequencing with the nothingCases; conceptually it is alt.

A conceptually cleaner approach would be to handle the normalCases first, then apply the withRetracted for the nothingCases to the result of the normalCases.

compiler/src/dotty/tools/dotc/typer/Typer.scala

tests/explicit-nulls/neg/i21380c.scala

Co-authored-by: Ondřej Lhoták <[email protected]>

noti0na1 · 2024-11-04T16:25:03Z

I tried to implemented a TerminatedInfo, whose asserted set acts like a universal set, but I don't like the result.

Ideally with the TerminatedInfo, we can analyse more cases where statements are followed by a terminating expression (type Nothing), like return, throw, and other applications with result type Nothing. However, the implementation becomes much more complicated, without significant improvement. We still cannot analyse any deeply-nested exceptions. In addition, I don't like the behaviour of treating all refs non-nullable after a termination point, even though I understand it is theoretical ok to do so.

If we ignore these Nothing` expressions, we would only lose some specific tracking, and I think it is an acceptable tradeoff.

val s: String | Null
if ... then 
  s = null
  return
else
  s = ""

s // treating s as non-nullable or not?

olhotak · 2024-11-04T16:47:53Z

OK, I don't think an Info with an asserted set that acts like a universal set is necessary.

noti0na1 added the area:nullability label Sep 21, 2024

noti0na1 marked this pull request as ready for review September 21, 2024 23:48

noti0na1 requested a review from olhotak September 21, 2024 23:48

noti0na1 assigned olhotak Sep 21, 2024

olhotak assigned noti0na1 and unassigned olhotak Sep 22, 2024

noti0na1 added 3 commits October 11, 2024 05:58

Refactor NotNullInfo to record every reference which is retracted once.

404098f

Use a different rule for NotNullInfo

273c867

Consider cases with Nothing type

0a319be

noti0na1 force-pushed the fix-21619 branch from 4ab4b0a to 0a319be Compare October 11, 2024 04:24

noti0na1 assigned olhotak and unassigned noti0na1 Oct 11, 2024

olhotak reviewed Oct 11, 2024

View reviewed changes

olhotak assigned noti0na1 and unassigned olhotak Oct 11, 2024

Apply suggestions from code review

ada9da3

Co-authored-by: Ondřej Lhoták <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix #21619: Refactor NotNullInfo to record every reference which is retracted once. #21624

Fix #21619: Refactor NotNullInfo to record every reference which is retracted once. #21624

noti0na1 commented Sep 21, 2024

olhotak commented Sep 22, 2024

olhotak commented Sep 22, 2024

noti0na1 commented Sep 22, 2024

noti0na1 commented Sep 22, 2024 •

edited

Loading

olhotak Oct 11, 2024

noti0na1 Oct 14, 2024

olhotak Oct 14, 2024

olhotak Oct 11, 2024

noti0na1 commented Nov 4, 2024 •

edited

Loading

olhotak commented Nov 4, 2024

Fix #21619: Refactor NotNullInfo to record every reference which is retracted once. #21624

Are you sure you want to change the base?

Fix #21619: Refactor NotNullInfo to record every reference which is retracted once. #21624

Conversation

noti0na1 commented Sep 21, 2024

olhotak commented Sep 22, 2024

olhotak commented Sep 22, 2024

noti0na1 commented Sep 22, 2024

noti0na1 commented Sep 22, 2024 • edited Loading

olhotak Oct 11, 2024

Choose a reason for hiding this comment

noti0na1 Oct 14, 2024

Choose a reason for hiding this comment

olhotak Oct 14, 2024

Choose a reason for hiding this comment

olhotak Oct 11, 2024

Choose a reason for hiding this comment

noti0na1 commented Nov 4, 2024 • edited Loading

olhotak commented Nov 4, 2024

noti0na1 commented Sep 22, 2024 •

edited

Loading

noti0na1 commented Nov 4, 2024 •

edited

Loading